refactor(hummock): add owned from_persisted_protobuf to reduce memory amplification#24940
refactor(hummock): add owned from_persisted_protobuf to reduce memory amplification#24940
Conversation
There was a problem hiding this comment.
Pull request overview
Refactors Hummock protobuf deserialization paths to support owned conversions, aiming to reduce memory amplification by moving large protobuf fields (e.g., SSTable key ranges / table_ids) instead of cloning during checkpoint/version loading.
Changes:
- Add owned
from_persisted_protobuf_owned/from_protobuf_ownedconversion APIs for Hummock version/delta/checkpoint-related types. - Implement
From<PbHummockVersion>forHummockVersionCommon<T>and addTableChangeLogCommon::from_protobuf_ownedto enable move-based conversions. - Update call sites (meta snapshot load/build paths, time travel, tests, benches) to use owned conversion APIs when protobuf values are already owned/temporary.
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| src/storage/src/hummock/store/version.rs | Update test to construct HummockVersion via owned persisted-protobuf conversion. |
| src/storage/src/hummock/event_handler/uploader/test_utils.rs | Update test helper to use owned persisted-protobuf conversion. |
| src/storage/hummock_sdk/src/version.rs | Add owned persisted-protobuf conversion entrypoints and owned From<PbHummockVersion> conversion. |
| src/storage/hummock_sdk/src/change_log.rs | Add owned change log conversion helper (from_protobuf_owned). |
| src/storage/benches/bench_table_watermarks.rs | Switch benchmark setup to owned persisted-protobuf conversion. |
| src/storage/backup/src/meta_snapshot_v2.rs | Decode snapshot metadata using owned HummockVersion conversion. |
| src/storage/backup/src/meta_snapshot_v1.rs | Decode snapshot metadata using owned HummockVersion conversion. |
| src/meta/src/hummock/manager/time_travel.rs | Use owned persisted-protobuf conversions when to_protobuf() already yields owned protobuf values. |
| src/meta/src/hummock/manager/mod.rs | Use owned persisted-protobuf conversion for deltas loaded from DB models. |
| src/meta/src/hummock/manager/checkpoint.rs | Add owned checkpoint conversion and use it when reading checkpoints from object store. |
| src/meta/src/backup_restore/meta_snapshot_builder.rs | Convert version deltas using the owned persisted-protobuf API. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| /// Convert an owned `PbHummockVersionDelta` deserialized from persisted state to | ||
| /// `HummockVersionDelta`, moving data instead of cloning. | ||
| pub fn from_persisted_protobuf_owned(delta: PbHummockVersionDelta) -> Self { | ||
| delta.into() | ||
| } |
There was a problem hiding this comment.
from_persisted_protobuf_owned currently just calls delta.into(), but the underlying From<PbHummockVersionDelta> conversion still clones change_log_delta (log_delta.new_log.clone() + iterating by reference). That means the new owned API won’t fully eliminate the intended cloning/memory amplification for deltas. Consider changing the From<PbHummockVersionDelta> impl to consume pb_version_delta.change_log_delta via into_iter() and convert each PbChangeLogDelta with its owned From<PbChangeLogDelta> impl (in change_log.rs), avoiding the clone() entirely.
… amplification - Add From<PbHummockVersion> for HummockVersionCommon<T> (owned) to move data instead of cloning during deserialization - Add from_persisted_protobuf_owned and from_protobuf_owned methods for HummockVersion, HummockVersionDelta, and HummockVersionCheckpoint - Add TableChangeLogCommon::from_protobuf_owned for owned change log conversion - Update all callers that pass owned/temporary protobuf values to use the new owned variants, eliminating unnecessary clones of key_range, table_ids, and stale_objects across thousands of SstableInfos
0701220 to
b4a42fd
Compare
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
Checklist
Documentation
Release note